Project 2: Google App Engine
* What: Google App Engine - an infrastructure for scalable web applications
* Where: Google App Engine
* Who: James Chen & David Underhill
* habiba

= Overview =
The Google App Engine (GAE) is a framework that enables developers to run their web applications directly on Google's scalable infrastructure. The runtime environment supports the full Python language and most of the Python standard library. GAE reduces time to deployment and relieves developers of server-side concerns by providing a hosting environment and a simple process for uploading and managing applications. It also automatically handles scaling and load balancing as the number of users grows.

GAE also provides a collection of APIs to simplify common programming tasks:
* Datastore API - provides a scalable, transactional storage system for creating, storing, and querying data objects.
* Images API - provides simple image manipulation operations.
* Mail API - provides a service for sending email messages from web applications.
* Memcache API - provides a high-performance in-memory key-value cache that is accessible by all instances of an application.
* URL Fetch API - provides a mechanism to communicate with other applications and access other web resources through URLs.
* Users API - provides a simple way to manage user authentication using Google Accounts.

= Web Application Development with Google App Engine =
The process of creating a GAE web application is depicted in Figure 1. GAE provides developers with a software development kit (SDK) consisting of two primary components:
# The Development Web Server simulates the App Engine environment. It includes local versions of the datastore and Google Accounts, along with the ability to fetch URLs, send email, etc. This allows developers to test their applications fully and safely on their local computers. It also avoids the time-consuming upload-test-modify development cycle.
# The Upload Script uploads the entire application to the GAE infrastructure when the application is ready to be launched.

GAE supports any framework written in pure Python that implements a CGI or WSGI-compliant interface. Examples include Django, CherryPy, Pylons and web.py, among others. A framework can be bundled with an application by simply copying it into the application directory. Alternatively, GAE provides a simple pre-installed framework known as webapp that partitions a web application into three parts:
# A WSGIApplication instance that routes incoming requests to handlers based on the URL.
# One or more RequestHandler classes that process requests and build responses.
# A main routine that runs the WSGIApplication using a CGI adaptor.

The webapp framework supports Django's templating engine, so developers can keep their HTML in a separate file, with special syntax indicating where data from the application appears.

A simple YAML text file is used to configure a web application. The configuration includes the application name, the application version, and the mapping of incoming URLs to directories and handler scripts. A directory can be configured to serve static files directly to a web browser. This is useful for non-dynamic files such as images, CSS stylesheets, JavaScript files and Flash animations. An example configuration file can be found here.

Once an application is uploaded to the Google infrastructure, GAE provides developers with a web-based administration console to oversee their applications (see Figure 2). The console can be used to monitor user traffic, select which version(s) of the application to run, and control who can modify the application.

= Uniqueness =
GAE is ''orthogonal'' to many other frameworks. It provides scalable hosting services at a high level by constraining how applications can access data and otherwise operate.
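As a rough illustration of the three-part structure that webapp uses (a router, one or more handlers, and a main routine), the sketch below reproduces the same shape using only Python's standard `wsgiref` module. It is not the webapp API: `Router`, `MainPage`, `NotFound` and `call` are hypothetical names chosen for this example, and `CGIHandler` is the standard library's CGI adaptor, standing in for the one GAE provides.

```python
from wsgiref.handlers import CGIHandler
from wsgiref.util import setup_testing_defaults


class MainPage:
    """Handler for one URL (plays the role of a RequestHandler class)."""

    def get(self, environ):
        return "200 OK", "Hello from the main page\n"


class NotFound:
    """Fallback handler used when no route matches the request path."""

    def get(self, environ):
        path = environ.get("PATH_INFO", "/")
        return "404 Not Found", "No handler for %s\n" % path


class Router:
    """WSGI application that routes requests to handlers based on the URL."""

    def __init__(self, routes):
        self.routes = dict(routes)

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "/")
        handler_cls = self.routes.get(path, NotFound)
        status, body = handler_cls().get(environ)
        start_response(status, [("Content-Type", "text/plain; charset=utf-8")])
        return [body.encode("utf-8")]


application = Router([("/", MainPage)])


def main():
    # Main routine: run the WSGI application through a CGI adaptor.
    CGIHandler().run(application)


def call(app, path):
    """Invoke a WSGI app in-process and return (status, body) for testing."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body.decode("utf-8")


print(call(application, "/")[0])  # prints: 200 OK
```

Calling the application in-process, as `call` does, is a convenient way to exercise handlers without a server. Under a CGI server, the script would instead end by invoking `main()`.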
Though a few other frameworks provide a cloud computing infrastructure (such as Amazon's cloud), those approaches give developers access at a much lower, unconstrained level. As a result, they cannot provide automated scalability support comparable to GAE's.

Many other frameworks attempt to improve developer productivity by providing reusable web components (like UI widgets) or other abstractions (like templating mechanisms). GAE essentially allows developers to use any of these other frameworks in conjunction with itself.

= Critique =
== Strengths ==
GAE greatly reduces the difficulties associated with developing and serving web applications. It substantially lowers barriers to entry by offering a zero-cost hosting solution which includes important fundamental components like web servers and data storage. The infrastructure automatically provides scalability, so popular web applications can handle large volumes of potentially bursty traffic with minimal development effort. This enables developers to focus on implementing new features.

GAE also provides a helpful administrative interface and simple APIs to further enhance developer productivity. The administrative interface presents various analytics and provides a web-based facility for executing commands (code) and datastore queries.

Finally, GAE provides powerful support for analyzing hosted applications. It can host multiple versions of the same web application, enabling developers to try out new releases on a subset of the user base first. GAE even supports profiling an application while it runs on Google's infrastructure.

== Weaknesses ==
GAE makes a number of concessions in order to provide scalability. One of the biggest departures from a typical infrastructure is the lack of a relational database. Instead, long-term data is stored in Bigtable, Google's distributed storage system.
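One familiar pattern that needs rethinking on such a datastore is a counter: a single record incremented on every request becomes a write hot spot. A commonly described workaround is to split the counter into several shards, write to a random shard, and sum the shards on read. The following is a minimal in-memory sketch of that idea, not GAE code: `ShardedCounter`, `num_shards` and the key format are hypothetical, and a plain dict stands in for the datastore.

```python
import random


class ShardedCounter:
    """Counter split across several shards to spread write contention.

    Assumption: a dict stands in for the datastore. In a real deployment each
    shard would be a separate stored record, updated independently.
    """

    def __init__(self, name, num_shards=20, store=None):
        self.name = name
        self.num_shards = num_shards
        self.store = store if store is not None else {}

    def _shard_key(self, index):
        return "%s-shard-%d" % (self.name, index)

    def increment(self, amount=1):
        # Pick a random shard so concurrent writers rarely hit the same one.
        key = self._shard_key(random.randrange(self.num_shards))
        self.store[key] = self.store.get(key, 0) + amount

    def total(self):
        # Reading sums every shard: reads cost more so that writes can scale.
        return sum(
            self.store.get(self._shard_key(i), 0)
            for i in range(self.num_shards)
        )


counter = ShardedCounter("page_views", num_shards=5)
for _ in range(1000):
    counter.increment()
print(counter.total())  # prints: 1000
```

The trade-off is deliberate: each write touches one small record chosen at random, while a read must visit all shards. Increasing `num_shards` raises write throughput at the cost of more expensive reads.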
This datastore forces developers to think in a more constrained way, and it may cause serious scalability issues if used as if it were a traditional relational database (e.g. scalable counters cannot be implemented in the obvious RDBMS way).

To prevent an application from using too many resources, GAE imposes many artificial quotas. These quotas are currently inflexible and cause substantial headaches for developers, though in the future it will be possible to purchase additional resources beyond the free quota. An application may not have more than 1,000 files, which makes it difficult to use libraries that alone often exceed this limit. No file or data structure may exceed 1 MB, which can be limiting for some applications. Large aggregations of data, and even large database exports, cannot be done because of various runtime and query limitations. Until these quotas can be increased, the range of web applications that can be effectively developed will be limited.

Other restrictions are imposed for security reasons. For instance, GAE only permits communication over HTTP connections on the standard HTTP and HTTPS ports. It does not allow multi-threaded code either. Though these restrictions seem unlikely to affect many applications, they can certainly be problematic for some. For instance, it might be useful to do some computation while waiting for a URL fetch to complete.

Finally, GAE currently forces developers to write Python code. This excludes some developers. More importantly, it excludes applications that need critical functionality implemented in a more efficient language. It also excludes existing libraries which are not available in Python.

= Is GAE right for your application? =
Applications which need performance Python cannot deliver, properties only available in relational databases, or access to bare metal machines (or virtual equivalents) would not be well suited to GAE.
Applications with these kinds of requirements would be better suited to Amazon's web services or Microsoft's Azure, which provide lower-level and more generic abstractions than GAE.

GAE is best suited for small and medium-sized applications. It provides a solid, productive infrastructure which automates much of the difficulty usually associated with creating scalable web applications.

= Resources =
* Google App Engine Documentation
* Ashcraft, Ken. Best Practices - Building a Production Quality Application on Google App Engine. Google IO 2008.
* Slatkin, Brett. Building Scalable Web Applications with Google App Engine. Google IO 2008.
* Barrett, Ryan. Under the Covers of the Google App Engine Datastore. Google IO 2008.
* Chang, Fay et al. Bigtable: A distributed storage system for structured data. OSDI 2006.
* van Rossum, Guido. Google App Engine: Run your web applications on Google's infrastructure. Stanford EE Computer Systems Colloquium. 5 Nov 2008.
* Coté, Michael. Clouds Rolling In: The Google App Engine Q&A. 8 Apr 2008.